Abstract. In order to estimate model parameters and circumvent possible difficulties encountered with the likelihood function, we propose to replace the likelihood in the formula of the posterior distribution by a function depending on a contrast. The properties of the contrast-based (CB) posterior distribution and MAP estimator are studied to understand what the consequences of incorporating a contrast in the Bayesian formula are. We show that the proposed method can be used to make frequentist inference and allows the reduction of analytical calculations to get the limit variance matrix of the estimator. For specific contrasts, the CB-posterior distribution directly approximates the limit distribution of the estimator; the calculation of the limit variance matrix is then avoided. Moreover, for these contrasts, the CB-posterior distribution can also be used to make inference in the Bayesian way. The method is applied to three spatial data sets.

Key words. Frequentist estimation; Quasi-Bayesian estimation; Spatial models.



1 Introduction 



In both the frequentist and the Bayesian viewpoints, the likelihood function has become the major component of statistical inference under a parametric model. Its use, however, has drawbacks in specific situations. First, it may be impossible to write down the likelihood in a numerically tractable form; see the cases of Boolean models (Van Lieshout and Van Zwet, 2001), Markov point processes (Møller, 2003), Markov spatial processes (Guyon, 1985) and spatial generalized linear mixed models (spatial GLMM; Diggle et al., 1998) where multiple integrals cannot be reduced due to spatial dependences. Second, the likelihood may not be completely appropriate because of the associated assumptions. For instance, the likelihood is built under an assumption on the distribution of data, but such an assumption may be tricky to specify in case of insufficient information as in classical geostatistics (Chilès and Delfiner, 1999; see also McCullagh and Nelder, 1989, chap. 9). In the same vein, all data are assumed to have the same weight in the likelihood, but the influence of outliers may be too large according to the analyst (Markatou, 2000).



The difficulties encountered with the likelihood can be circumvented with 
existing Bayesian and frequentist procedures. 

• There are procedures which use conditional simulation to numerically approximate the likelihood. For instance, the Markov chain Monte Carlo algorithm (MCMC; Robert and Casella, 1999) allows the approximation of the posterior distribution for Markov point processes (Møller, 2003) and spatial GLMMs (Diggle et al., 1998). The Monte Carlo expectation maximization algorithm (MCEM; Wei and Tanner, 1990) allows the maximization of the likelihood for Boolean models (Van Lieshout and Van Zwet, 2001) and spatial GLMMs (Zhang, 2002).



• There are procedures where the likelihood function is simplified or replaced. For example, the pseudo-likelihood, which only accounts for local dependence structures, is used instead of the likelihood for Markov point processes (Møller, 2003) and Markov spatial processes (Besag, 1975; Guyon, 1985). The generalized least squares estimation, which does not rely on assumptions on the distribution of data, is used in geostatistics; see Chilès and Delfiner (1999, chap. 2-3) and Stein (1999, chap. 1). Other procedures belonging to this category are: the weighted likelihood maximization (Markatou, 2000), the method of moments, the M-estimation (Serfling, 2002), the approximate Bayesian computation (ABC; Beaumont et al., 2002), the quasi-likelihood maximization (McCullagh and Nelder, 1989) and the quasi-Bayesian likelihood method (Lin, 2006).



In the quasi-Bayesian likelihood approach, the likelihood appearing in the posterior distribution formula is replaced by a quasi-likelihood which does not rely on distribution assumptions. Then, the posterior distribution which is obtained is used to make inference as in classical Bayesian situations. In this communication we propose to generalize this approach: the likelihood in the posterior distribution formula is replaced by a function of a contrast.

A contrast is a function of the model parameters and the observed data which is minimized to estimate the parameters (Dacunha-Castelle and Duflo, 1982). The minimum contrast approach is a generic estimation method which was developed in a frequentist perspective. The maximum likelihood estimation as well as the maximum pseudo, weighted or quasi likelihood estimation, the diverse least squares methods, the method of moments and the M-estimation can be formulated as minimum contrast estimation problems.

Thus, the proposed procedure (replacing the likelihood by a function of a contrast in the Bayesian formula) includes the classical Bayesian approach (here and thereafter, "classical" refers to "likelihood-based") and the quasi-Bayesian approach of Lin (2006). This procedure provides a contrast-based (CB) posterior distribution which does not coincide, in the general case, with the classical posterior distribution. In this paper, we investigate the properties of the posterior distribution and of the MAP (maximum a posteriori) estimator based on a contrast.

Under mild conditions on the prior distribution, we show that the CB-MAP estimator inherits the asymptotic properties (consistency and asymptotic normality) of the minimum contrast estimator, as the classical MAP estimator inherits the asymptotic properties of the maximum likelihood estimator (Caillot and Martin, 1972). The limit variance matrix of the normalized estimator is $I_\theta^{-1}\Gamma_\theta I_\theta^{-1}$ where $\Gamma_\theta$ is the limit variance of the gradient of the contrast and $I_\theta$ is the limit Hessian matrix of the contrast.

Moreover, we show that the CB-posterior distribution is asymptotically equivalent to a normal distribution whose variance matrix is $I_\theta^{-1}$. Therefore, when building the contrast, particular attention must be paid to satisfy, if possible, $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$. Indeed, with such a contrast, inference can be made without computing matrices $\Gamma_\theta$ and $I_\theta$: the posterior distribution can either be used as a limit distribution in a frequentist viewpoint or be used to make inference in the Bayesian way. When building a contrast satisfying $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$ is not possible, the CB-posterior distribution can nevertheless be used to estimate $I_\theta$. Thus, the computation of the limit Hessian matrix of the contrast is avoided.

To summarize, the present study shows the consequences of replacing the likelihood by a function of a contrast. It also provides an estimation method which has advantages over existing methods exploited to circumvent difficulties encountered with the likelihood. First, it does not require a simulation-based algorithm such as the MCMC, MCEM or ABC algorithms. Second, it inherits the richness of the minimum contrast approach (there are many types of contrast: likelihood, least squares, moments...). Third, compared to the classical contrast method, the computation of the derivatives of the contrast is limited. Fourth, when $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, the CB-posterior distribution can be directly used to make inference either in the frequentist perspective or in the Bayesian perspective. However, the proposed method also has drawbacks. In particular, building a contrast which exploits a large part of the information in the data, as the likelihood does, is not obvious. Besides, building a contrast satisfying $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$ requires analytical work which can be time consuming. Furthermore, obtaining such a contrast is not always possible.

The article is organized as follows. The classical minimum contrast method of estimation is recalled in section 2 and examples are given. The method that we propose is presented in section 3 and its properties are derived. Then, the method is applied in section 4 to simulated and real cases dealing with spatial statistics (estimation of the range parameter of a variogram; estimation of the parameters of a Markovian spatial process; and estimation of the parameters of an autosimilar model used to describe soil roughness). The three cases illustrate the application of the method when the parameter has one or several components and when $I_\theta^{-1}\Gamma_\theta I_\theta^{-1}$ is equal to or different from $I_\theta^{-1}$.

2 Recall: Classical minimum contrast estimation



2.1 Estimator and asymptotic properties 



Detailed information on minimum contrast estimation can be found in Dacunha-Castelle and Duflo (1982). Here, we avoid the complete notations. Consider a family of parametric models $\{P_\alpha : \alpha \in \Theta\}$ and samples of increasing sizes $t \in T \subset \mathbb{N}$, drawn from $P_\theta$. A contrast for $\theta$ is a random function $\alpha \mapsto U_t(\alpha)$ defined over $\Theta$, depending on a sample of size $t$, and such that $\{U_t(\alpha)\}_t$ converges in probability, as $t \to \infty$, to a function $\alpha \mapsto K(\alpha, \theta)$ which has a strict minimum at $\alpha = \theta$. The minimum contrast estimator is

$$\hat\theta_t = \operatorname{argmin}\{U_t(\alpha),\ \alpha \in \Theta\}.$$

Let us make the following classical assumptions:

H1: $\Theta \subset \mathbb{R}^p$, $p < \infty$, is compact and $\theta$ is in the interior of $\Theta$,
H2: $\alpha \mapsto K(\alpha, \theta)$ has a strict minimum at $\theta$,
H3: $\alpha \mapsto U_t(\alpha)$ is $C^2$ (it has two continuous derivatives) over $\Theta$,
H4: the normalized gradient vector $\sqrt{t}\,\mathrm{grad}\,U_t(\theta)$ (first derivatives of $U_t(\theta)$ with respect to $\theta$) converges in law to the normal distribution $\mathcal{N}(0, \Gamma_\theta)$:

$$\sqrt{t}\,\mathrm{grad}\,U_t(\theta) \to \mathcal{N}(0, \Gamma_\theta) \quad \text{in law as } t \to \infty,$$

H5: the Hessian matrix $\mathrm{H}U_t(\theta)$ (second derivatives of $U_t(\theta)$ with respect to $\theta$) converges in probability to an invertible matrix $I_\theta$:

$$\mathrm{H}U_t(\theta) \to I_\theta \quad \text{in probability as } t \to \infty,$$

H6: $\sup_{\|\beta\| < \epsilon} |\mathrm{H}_{kl}U_t(\theta + \beta) - \mathrm{H}_{kl}U_t(\theta)| \to 0$ in probability, where $\epsilon > 0$ and $\mathrm{H}_{kl}$ is the component $(k, l)$, $1 \le k, l \le p$, of the Hessian operator.

Under these assumptions, the minimum contrast estimator is consistent and asymptotically normal: as $t \to \infty$,

• $\hat\theta_t$ converges in probability to $\theta$ and

• $\sqrt{t}(\hat\theta_t - \theta)$ converges in law to the Gaussian distribution $\mathcal{N}(0, I_\theta^{-1}\Gamma_\theta I_\theta^{-1})$.
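As a minimal sketch of how the sandwich form of the limit variance is assembled in the scalar case, the Python snippet below uses a hypothetical least-squares contrast $U_t(\alpha) = \frac{1}{2t}\sum_i (X_i - \alpha)^2$ for a location parameter. This contrast, the i.i.d. setting and the function names are illustrative assumptions and are not taken from the applications of this article.

```python
def min_contrast_mean(xs):
    """Minimizer of U_t(a) = (1/(2t)) * sum((x - a)^2), i.e. the sample mean."""
    return sum(xs) / len(xs)


def sandwich_variance(xs):
    """Plug-in estimate of I^{-1} * Gamma * I^{-1} for the contrast above.

    Assumption: i.i.d. observations, so that the variance of
    sqrt(t) * grad U_t is the variance of one gradient contribution.
    """
    t = len(xs)
    theta_hat = min_contrast_mean(xs)
    # Per-observation contributions to the gradient of the contrast.
    scores = [-(x - theta_hat) for x in xs]
    gamma_hat = sum(s * s for s in scores) / t  # estimate of Gamma_theta
    i_hat = 1.0                                 # second derivative of U_t
    return gamma_hat / (i_hat ** 2)


variance = sandwich_variance([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this toy contrast the Hessian is the constant 1, so the sandwich reduces to the variance of the data; for other contrasts both factors have to be estimated.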

2.2 Examples

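Maximum likelihood. Consider an i.i.d. sample $(X_i)_{1 \le i \le n}$ (here, $T = \mathbb{N}$), each element being drawn from the density $p_\theta(\cdot)$. The likelihood function is $\prod_{1 \le i \le n} p_\theta(X_i)$. The corresponding contrast is

$$U_n(\alpha) = -\frac{1}{n}\sum_{i \le n} \log p_\alpha(X_i).$$

The limit function $K$ is the opposite of the Kullback information: $K(\alpha, \theta) = -E_\theta\{\log p_\alpha(X_i)\}$, the matrices $I_\theta$ and $\Gamma_\theta$ satisfy

$$I_\theta = \Gamma_\theta = E_\theta\left[\mathrm{grad}_\theta\{\log p_\theta(X_i)\}\ \mathrm{grad}_\theta\{\log p_\theta(X_i)\}'\right],$$

and the convergence in law simplifies into $\sqrt{n}(\hat\theta_n - \theta) \to \mathcal{N}(0, I_\theta^{-1})$.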
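To make the likelihood contrast concrete, here is a small sketch for a hypothetical exponential density $p_\alpha(x) = \alpha e^{-\alpha x}$, chosen only because its minimum contrast estimator has a closed form. The density and the function names are assumptions made for illustration.

```python
import math


def exp_likelihood_contrast(alpha, xs):
    """U_n(alpha) = -(1/n) * sum(log p_alpha(x)) with p_alpha(x) = alpha * exp(-alpha * x)."""
    n = len(xs)
    return -sum(math.log(alpha) - alpha * x for x in xs) / n


def exp_min_contrast_estimator(xs):
    """Closed-form minimizer of the contrast above: one over the sample mean."""
    return len(xs) / sum(xs)


xs = [0.5, 1.5]
theta_hat = exp_min_contrast_estimator(xs)
u_at_min = exp_likelihood_contrast(theta_hat, xs)
```

Evaluating the contrast slightly to the left and right of `theta_hat` gives larger values, which is a quick sanity check that the closed form is indeed the minimizer.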

Least squares. Here we present the least-square method as a contrast method in the case of the estimation of a variogram. This case will be used as an illustration in the application section.

Consider a stationary Gaussian random field $X$ over $\mathbb{R}^2$ with mean value zero and with parametric variogram $\gamma_\theta(h) = \frac{1}{2}E_\theta\{(X_i - X_j)^2\}$, where $h = d(i, j)$ is the distance between $X_i$ and $X_j$ (Chilès and Delfiner, 1999). Assume that the sample is made on a square grid $\{i = (i_1, i_2) : 0 < i_1, i_2 \le n\}$ with size $n^2$; the sample is denoted by $(X_i)_{0 < i_1, i_2 \le n}$ where $i = (i_1, i_2)$ (here $T = \{n^2 : n \in \mathbb{N}\}$). The variogram can be estimated with the least square method (Chilès and Delfiner, 1999). In practice, the sample variogram $\hat\gamma$ is computed at each possible distance $h_l$ ($l \le k$) between points: $\hat\gamma(h_l) = \frac{1}{2 n_l}\sum_{(i,j) \in C_l}(X_i - X_j)^2$, where $C_l$ is the set of pairs of points separated by $h_l$ and $n_l = \# C_l$, and the contrast between the sample variogram and the theoretical variogram

$$U_{n^2}(\alpha) = \frac{1}{2}\sum_{l \le k}\{\hat\gamma(h_l) - \gamma_\alpha(h_l)\}^2 \qquad (1)$$

is minimized. The limit function $K$ of the contrast is $K(\alpha, \theta) = \frac{1}{2}\sum_{l \le k}\{\gamma_\theta(h_l) - \gamma_\alpha(h_l)\}^2$. In this context, the sample variogram $\{\hat\gamma(h_l)\}_{l \le k}$ is unbiased with mean $\mu_\theta = \{\gamma_\theta(h_l)\}_{l \le k}$ and $n\{\hat\gamma(h_l) - \gamma_\theta(h_l)\}_{l \le k}$ is asymptotically normal with variance matrix denoted by $\Sigma_\theta$. It follows that $n(\hat\theta_n - \theta) \to \mathcal{N}(0, I_\theta^{-1}\Gamma_\theta I_\theta^{-1})$ where the component $(i, j)$ of $\Gamma_\theta$ is $\frac{\partial \mu_\theta'}{\partial \theta_i}\Sigma_\theta\frac{\partial \mu_\theta}{\partial \theta_j}$, the component $(i, j)$ of $I_\theta$ is $\frac{\partial \mu_\theta'}{\partial \theta_i}\frac{\partial \mu_\theta}{\partial \theta_j}$ and $\mu_\theta'$ is the transpose of $\mu_\theta$.
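The least-square contrast can be sketched in a few lines of Python. The code below computes a sample variogram on a small square grid and evaluates the contrast for an exponential variogram $\gamma_\alpha(h) = 1 - e^{-\alpha h}$ (the model used later in the application section). Counting each unordered pair once, the `max_dist` cut-off and the function names are assumptions made for this illustration.

```python
import math
from collections import defaultdict


def sample_variogram(field, max_dist):
    """Sample variogram of a square grid (list of rows) with unit spacing.

    Returns {distance: half the mean squared difference over the pairs
    separated by that distance}, for distances in (0, max_dist].
    """
    n = len(field)
    sums = defaultdict(float)
    counts = defaultdict(int)
    sites = [(i, j) for i in range(n) for j in range(n)]
    for a in range(len(sites)):
        for b in range(a + 1, len(sites)):
            (i1, j1), (i2, j2) = sites[a], sites[b]
            d2 = (i1 - i2) ** 2 + (j1 - j2) ** 2  # exact integer key
            if d2 <= max_dist ** 2:
                sums[d2] += (field[i1][j1] - field[i2][j2]) ** 2
                counts[d2] += 1
    return {math.sqrt(d2): sums[d2] / (2 * counts[d2]) for d2 in sums}


def ls_contrast(alpha, gamma_hat):
    """Half the sum of squared gaps between sample and exponential variograms."""
    return 0.5 * sum((g - (1.0 - math.exp(-alpha * h))) ** 2
                     for h, g in gamma_hat.items())


gamma_hat = sample_variogram([[0, 1], [1, 0]], max_dist=2)
u_value = ls_contrast(1.0, gamma_hat)
```

On the 2 x 2 checkerboard above, the four pairs at distance 1 all differ by one unit and the two diagonal pairs are equal, so the sample variogram is 0.5 at distance 1 and 0 at distance $\sqrt{2}$.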



Pseudo-likelihood. Here we present the pseudo-likelihood method as a contrast method in the case of the estimation of the parameters of a Markov random field. This case will be used as an illustration in the application section.

Consider a stationary Markov random field $X$ over $\mathbb{Z}^2$ with state space $\{0, 1\}$. Assume that the conditional probability of $X_i$ given $X_j$, $j \ne i$, satisfies

$$P_\theta(X_i \mid X_j, j \ne i) = P_\theta(X_i \mid X_j, j \in V(i)) = \frac{\exp\left(\theta_1 X_i + \theta_2 \sum_{j \in V(i)} X_i X_j\right)}{1 + \exp\left(\theta_1 + \theta_2 \sum_{j \in V(i)} X_j\right)},$$

where $\theta = (\theta_1, \theta_2)$ is a pair of parameters and $V(i)$ is the set of the four nearest neighbors of $i$ (Guyon, 1985). We assume in the following that the Markov field is $\alpha$-mixing; this is satisfied if $|\theta_2| < 1$ for example. Moreover, the field is observed on the square grid $\mathcal{I} = \{i = (i_1, i_2) : 0 < i_1, i_2 \le n\}$ with size $n^2$ (here $T = \{n^2 : n \in \mathbb{N}\}$). The likelihood cannot be analytically calculated. Therefore, a pseudo-likelihood was proposed to make the inference (Guyon, 1985). The pseudo-likelihood is the product of the conditional probabilities $\prod_{i \in \mathcal{I}} P_\theta(X_i \mid X_j, j \ne i)$. The corresponding contrast is

$$U_{n^2}(\alpha) = -\frac{1}{n^2}\sum_{i \in \mathcal{I}} \log P_\alpha(X_i \mid X_j, j \in V(i)). \qquad (2)$$

Let $W$ denote the set of possible states for the neighborhood of any point 0, then the limit function of the contrast is

$$K(\alpha, \theta) = -\sum_{w \in W}\sum_{x \in \{0,1\}} \log P_\alpha\{x \mid X_i = w_i, i \in V(0)\}\, P_\theta\{x \mid X_i = w_i, i \in V(0)\}\, P_\theta(w).$$

Moreover, $n(\hat\theta_n - \theta) \to \mathcal{N}(0, I_\theta^{-1}\Gamma_\theta I_\theta^{-1})$ where $I_\theta = \mathrm{var}(Z_0)$, $\Gamma_\theta = M_0 + 4\sum_{0 < i_1, i_2 < 2} M_i$, $M_i = \mathrm{cov}(Z_0, Z_i)$, $i \in \mathcal{I}$, and vectors $Z_i$ satisfy

$$Z_i = \left\{X_i - \frac{\exp\left(\theta_1 + \theta_2 \sum_{j \in V(i)} X_j\right)}{1 + \exp\left(\theta_1 + \theta_2 \sum_{j \in V(i)} X_j\right)}\right\}\begin{pmatrix} 1 \\ \sum_{j \in V(i)} X_j \end{pmatrix}.$$
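The pseudo-likelihood contrast for a two-state field with four nearest neighbors can be sketched as follows. The treatment of border sites (neighbors falling outside the grid are simply ignored) is an assumption made for this illustration, since the text does not specify it, and the function names are hypothetical.

```python
import math


def neighbor_sum(x, i, j):
    """Sum of the states of the (up to four) nearest neighbors of site (i, j)."""
    n = len(x)
    total = 0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if 0 <= a < n and 0 <= b < n:  # assumption: off-grid neighbors ignored
            total += x[a][b]
    return total


def pl_contrast(alpha, x):
    """Minus the mean log conditional probability of each site given its neighbors."""
    a1, a2 = alpha
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            eta = a1 + a2 * neighbor_sum(x, i, j)
            # log P(x_ij | neighbors) = x_ij * eta - log(1 + exp(eta))
            total += x[i][j] * eta - math.log1p(math.exp(eta))
    return -total / n ** 2


u_null = pl_contrast((0.0, 0.0), [[0, 1], [1, 0]])
```

With both parameters equal to zero every conditional probability is 1/2, so the contrast equals $\log 2$ whatever the observed field; this gives a simple check of the implementation.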



3 Incorporating a contrast in the Bayesian formula

3.1 Posterior distribution and MAP estimator based on 
a contrast 

In the Bayesian framework, a prior distribution denoted $c(\cdot)$ is defined over $\Theta$. Let $(X_i)_{i \le t}$ be a sample of size $t$ drawn from the distribution $P_\theta$, then the posterior distribution is

$$p(\theta \mid X_i, i \le t) = \frac{P_\theta(X_i, i \le t)\,c(\theta)}{\int_\Theta P_\alpha(X_i, i \le t)\,c(\alpha)\,d\alpha} = \frac{\exp(-tU_t(\theta))\,c(\theta)}{\int_\Theta \exp(-tU_t(\alpha))\,c(\alpha)\,d\alpha},$$

where $P_\theta(X_i, i \le t)$ is the likelihood and $U_t(\alpha) = -\frac{1}{t}\log P_\alpha(X_i, i \le t)$ is the corresponding contrast (see the first example presented above).

For the estimation of $\theta$, we propose to replace the contrast associated with the likelihood in the Bayesian formula written above by any contrast. We obtain a contrast-based (CB) posterior distribution denoted $p_t(\alpha)$:

$$p_t(\alpha) = \frac{\exp(-tU_t(\alpha))\,c(\alpha)}{\int_\Theta \exp(-tU_t(\beta))\,c(\beta)\,d\beta}.$$

The CB-MAP estimator obtained by maximizing $p_t(\cdot)$ is denoted

$$\tilde\theta_t = \operatorname{argmax}\{p_t(\alpha),\ \alpha \in \Theta\}.$$

$\tilde\theta_t$ is at the minimum of $\alpha \mapsto U_t(\alpha) - (1/t)\log c(\alpha)$, and does not coincide in the general case with the classical minimum contrast estimator $\hat\theta_t = \operatorname{argmin}\{U_t(\alpha),\ \alpha \in \Theta\}$.

In what follows we investigate the behavior of the CB-MAP estimator and 
the CB-posterior distribution. 
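For a real parameter, the CB-posterior density can be evaluated on a regular grid, which is enough to locate the CB-MAP estimate. The sketch below assumes a user-supplied contrast and a strictly positive prior on the grid; the quadratic contrast, the uniform prior on [0, 4] and the value of $t$ used in the example are illustrative choices, and the normalizing integral is approximated by a plain Riemann sum.

```python
import math


def cb_posterior_grid(contrast, prior, grid, t):
    """CB-posterior density exp(-t * U_t(a)) * c(a), normalized over a regular grid."""
    logw = [-t * contrast(a) + math.log(prior(a)) for a in grid]
    shift = max(logw)  # subtract the maximum to avoid underflow
    w = [math.exp(v - shift) for v in logw]
    step = grid[1] - grid[0]
    z = sum(w) * step  # Riemann approximation of the normalizing integral
    return [v / z for v in w]


def cb_map(grid, dens):
    """Grid point at which the CB-posterior density is largest."""
    return grid[max(range(len(grid)), key=dens.__getitem__)]


grid = [k * 0.01 for k in range(401)]  # regular grid over [0, 4]
dens = cb_posterior_grid(lambda a: 0.5 * (a - 1.0) ** 2,  # toy contrast
                         lambda a: 0.25,                  # uniform prior on [0, 4]
                         grid, t=100)
theta_map = cb_map(grid, dens)
```

With a uniform prior the CB-MAP estimate coincides with the minimizer of the contrast, here the grid point closest to 1.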

3.2 Consistency and asymptotic normality of the CB-MAP estimator

We noted above that the CB-MAP estimator $\tilde\theta_t$ is at the minimum of $\alpha \mapsto U_t(\alpha) - (1/t)\log c(\alpha)$. This function satisfies the definition of a contrast. Consequently, convergence properties of $\tilde\theta_t$ can be easily obtained by using the contrast theory. Assume that the hypotheses listed in section 2 are satisfied. Let us assume in addition that the prior distribution $c(\cdot)$ is differentiable and strictly positive over $\Theta$. It can be shown that, as $t \to \infty$,

• $\tilde\theta_t$ converges in probability to $\theta$ and

• $\sqrt{t}(\tilde\theta_t - \theta)$ converges in law to the Gaussian distribution $\mathcal{N}(0, I_\theta^{-1}\Gamma_\theta I_\theta^{-1})$,

where $I_\theta$ and $\Gamma_\theta$ are the matrices which were introduced when the classical minimum contrast method was presented:

$$\mathrm{H}U_t(\theta) \to I_\theta \quad \text{in probability as } t \to \infty,$$
$$\sqrt{t}\,\mathrm{grad}\,U_t(\theta) \to \mathcal{N}(0, \Gamma_\theta) \quad \text{in law}.$$
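3.3 Asymptotic deviation between $\tilde\theta_t$ and $\hat\theta_t$

The asymptotic deviation between the classical minimum contrast estimator $\hat\theta_t$ and the CB-MAP estimator $\tilde\theta_t$ is given by

$$\tilde\theta_t - \hat\theta_t = \frac{1}{t\,c(\theta)}\left(I_\theta^{-1}\,\mathrm{grad}\,c(\theta) + o_{proba}(1)\mathbf{1}_p\right), \qquad (4)$$

where $\mathbf{1}_p$ is the unit vector of size $p$ (the dimension of $\Theta$). Thus, the deviation between the two estimators is of order $1/t$.

Proof of (4). As $\tilde\theta_t$ satisfies $\mathrm{grad}\,p_t(\tilde\theta_t) = 0$,

$$0 = -t\,c(\tilde\theta_t)\,\mathrm{grad}\,U_t(\tilde\theta_t) + \mathrm{grad}\,c(\tilde\theta_t).$$

Then, applying a first order Taylor's expansion for $\mathrm{grad}\,U_t(\tilde\theta_t)$ around $\hat\theta_t$ yields

$$0 = -t\,c(\tilde\theta_t)\left\{\mathrm{grad}\,U_t(\hat\theta_t) + \mathrm{H}U_t(\hat\theta_t)(\tilde\theta_t - \hat\theta_t)(1 + o_{proba}(1))\right\} + \mathrm{grad}\,c(\tilde\theta_t).$$

In this equation, $\mathrm{grad}\,U_t(\hat\theta_t) = 0$ because $\hat\theta_t$ is the minimizer of $U_t(\cdot)$. Moreover, applying zero order Taylor's expansions for $c(\tilde\theta_t)$, $\mathrm{H}U_t(\hat\theta_t)$ and $\mathrm{grad}\,c(\tilde\theta_t)$ around $\theta$ yields

$$0 = -t\,c(\theta)\,\mathrm{H}U_t(\theta)(\tilde\theta_t - \hat\theta_t)(1 + o_{proba}(1)) + \mathrm{grad}\,c(\theta)$$
$$\phantom{0} = -t\,c(\theta)\,I_\theta(\tilde\theta_t - \hat\theta_t)(1 + o_{proba}(1)) + \mathrm{grad}\,c(\theta),$$

since $\lim_{t \to \infty}\mathrm{H}U_t(\theta) = I_\theta$ in probability. Then equation (4) follows.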
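The order $1/t$ of the deviation can be checked by hand on a toy case. The worked example below assumes a scalar quadratic contrast and an exponential-shaped prior with rate $k$; both are hypothetical choices made only so that every step is explicit, and the estimators are assumed to lie in the interior of $\Theta$.

```latex
% Toy check of the deviation between the CB-MAP and minimum contrast estimators.
% Assumptions: p = 1, U_t(\alpha) = \tfrac12(\alpha - \hat\theta_t)^2,
% prior c(\alpha) \propto e^{k\alpha} on a compact interval.
\begin{align*}
\tilde\theta_t
  &= \operatorname{argmin}_\alpha \Big\{ \tfrac12(\alpha - \hat\theta_t)^2
     - \tfrac{1}{t}\log c(\alpha) \Big\}
   = \operatorname{argmin}_\alpha \Big\{ \tfrac12(\alpha - \hat\theta_t)^2
     - \tfrac{k}{t}\,\alpha \Big\}, \\
0 &= (\tilde\theta_t - \hat\theta_t) - \tfrac{k}{t}
  \quad\Longrightarrow\quad
  \tilde\theta_t - \hat\theta_t = \frac{k}{t}, \\
\frac{1}{t\,c(\theta)}\, I_\theta^{-1}\, \mathrm{grad}\, c(\theta)
  &= \frac{1}{t}\cdot 1 \cdot \frac{c'(\theta)}{c(\theta)} = \frac{k}{t}.
\end{align*}
```

In this special case the leading term of the deviation formula is exact, because the contrast is quadratic and $\log c$ is linear, so no remainder appears.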

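3.4 Convergence of the CB-posterior distribution

The CB-posterior distribution $p_t(\cdot)$ is asymptotically equivalent to the density function of the Gaussian distribution $\mathcal{N}(\tilde\theta_t, [tI_\theta]^{-1})$:

$$p_t(\alpha) \underset{t \to \infty}{\sim} \frac{1}{(2\pi)^{p/2}\,|(tI_\theta)^{-1}|^{1/2}}\exp\left(-\frac{1}{2}(\alpha - \tilde\theta_t)'(tI_\theta)(\alpha - \tilde\theta_t)\right). \qquad (5)$$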
See the end of the section for the proof. This result allows us to figure out what the CB-posterior distribution is and how it can be used to make inference in the frequentist and Bayesian ways.

In the contrast theory, the distribution $\mathcal{N}(\hat\theta_t, (tI_\theta)^{-1}\Gamma_\theta I_\theta^{-1})$ is used to make frequentist inference about $\theta$: the point estimator is $\hat\theta_t$, and confidence zones are provided based on this normal distribution. Consequently, if the contrast is such that $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, then the CB-posterior distribution $p_t(\cdot)$ which approximates the density of $\mathcal{N}(\tilde\theta_t, (tI_\theta)^{-1})$ can be directly used to make frequentist inference about $\theta$: the mode of $p_t(\cdot)$ is the point estimator, and confidence zones can be directly determined from $p_t(\cdot)$. This case is particularly interesting since the calculation of the limit matrices $I_\theta = \lim_{t \to \infty}\mathrm{H}U_t(\theta)$ and $\Gamma_\theta = \lim_{t \to \infty}V_\theta(\sqrt{t}\,\mathrm{grad}\,U_t(\theta))$ is not required.

Moreover, when the contrast which is considered satisfies $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, we propose to use the CB-posterior distribution $p_t(\cdot)$ to make inference in the Bayesian way, i.e. to use $p_t(\cdot)$ as a real posterior density. The motivation is based on the following analogy: when the contrast corresponding to the likelihood is employed (in this case, $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$), then $p_t(\cdot)$ can be used (i) to make frequentist inference since it is an approximation of the limit distribution of the estimator (see above) and (ii) to make Bayesian inference since it is the classical posterior density. It has to be noted that, in the general case, the CB-posterior density $p_t(\cdot)$ does not coincide with the classical posterior density. It is a posterior density based on the information brought by the contrast under consideration.

If the contrast does not satisfy $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, then the CB-posterior distribution $p_t(\cdot)$ cannot be used to approximate the limit distribution of $\tilde\theta_t$ or to make Bayesian inference. However, $p_t(\cdot)$ can be used to estimate the matrix $I_\theta$, so avoiding the calculation of the second derivatives of the contrast. Indeed, one can see from (5) that an estimate of $I_\theta$ is the matrix $\Omega^{-1}/t$ where $\Omega$ is the variance matrix of the normal density function centered around $\tilde\theta_t$ and fitted to $p_t(\cdot)$. If $\theta$ is real, $I_\theta$ can be more simply estimated by $2\pi p_t(\tilde\theta_t)^2/t$ since equation (5) yields $p_t(\tilde\theta_t) \underset{t \to \infty}{\sim} \{tI_\theta/(2\pi)\}^{1/2}$. We have not found an equivalent way to easily estimate $\Gamma_\theta$ without analytical calculation of the second derivatives and without simulations.
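The two estimates of the limit Hessian described above can be sketched for a real parameter as follows. The check uses a synthetic Gaussian density standing in for the CB-posterior; the values of $t$, of the Hessian and of the mode are arbitrary illustrative choices, and fitting the normal density by matching the variance of the gridded density is an assumption about how the fit is carried out.

```python
import math


def hessian_from_peak(peak_density, t):
    """Scalar case: estimate of the limit Hessian from the height of the mode."""
    return 2.0 * math.pi * peak_density ** 2 / t


def hessian_from_variance(grid, dens, t):
    """Estimate 1 / (t * Omega), with Omega the variance of the gridded density."""
    step = grid[1] - grid[0]
    mass = sum(dens) * step
    mean = sum(a * p for a, p in zip(grid, dens)) * step / mass
    var = sum((a - mean) ** 2 * p for a, p in zip(grid, dens)) * step / mass
    return 1.0 / (t * var)


# Synthetic stand-in for a CB-posterior: normal density with variance 1 / (t * I).
t, i_true, mode = 400, 0.2, 1.34
sd = 1.0 / math.sqrt(t * i_true)
grid = [k * 0.001 for k in range(4001)]  # regular grid over [0, 4]
dens = [math.exp(-0.5 * ((a - mode) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        for a in grid]
i_peak = hessian_from_peak(max(dens), t)
i_var = hessian_from_variance(grid, dens, t)
```

On an exactly Gaussian density both estimates recover the same value; on a real CB-posterior they can differ when the density is skewed or truncated by the prior support.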

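Proof of (5). Let $\delta > 0$. For any $a$ such that $\sup_{1 \le i \le p}|a_i| < t^\delta$, a third order Taylor's expansion yields

$$\log p_t(\tilde\theta_t + a/\sqrt{t}) - \log p_t(\tilde\theta_t) = -\sqrt{t}\,a'\,\mathrm{grad}\,U_t(\tilde\theta_t) - \frac{1}{2}a'I_\theta a + O_{proba}(t^{\delta} + t^{3\delta - 1/2}).$$

Given that $\mathrm{grad}\,U_t(\hat\theta_t) = 0$ (definition of the classical minimum contrast estimator $\hat\theta_t$) and that $\tilde\theta_t - \hat\theta_t = O_{proba}(t^{-1})\mathbf{1}_p$ (see eq. (4)), the previous equation becomes

$$\log p_t(\tilde\theta_t + a/\sqrt{t}) - \log p_t(\tilde\theta_t) = -\frac{1}{2}a'I_\theta a + O_{proba}(t^{\delta} + t^{3\delta - 1/2}).$$

Ensuring that $\delta < 1/2$ (and not only $\delta > 0$), then

$$\log p_t(\tilde\theta_t + a/\sqrt{t}) - \log p_t(\tilde\theta_t) = -\frac{1}{2}a'I_\theta a + o_{proba}(t^{2\delta}) = -\frac{1}{2}a'I_\theta a\,\{1 + o_{proba}(1)\}.$$

Let us introduce $g_t : a \mapsto t^{-p/2}p_t(\tilde\theta_t + a/\sqrt{t})$ defined over $\mathbb{R}^p$. This density function satisfies, from the previous result,

$$g_t(a) \underset{t \to \infty}{\sim} t^{-p/2}p_t(\tilde\theta_t)\exp\left(-\frac{1}{2}a'I_\theta a\right).$$

Since $g_t(\cdot)$ is a density function and given the form of the right-hand-side term of this equation, $g_t(\cdot)$ is equivalent to the density function of the normal law with variance matrix $I_\theta^{-1}$. Equation (5) is then obtained with the change of variable $\alpha = \tilde\theta_t + a/\sqrt{t}$.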

3.5 Summary: making inference with the CB-posterior distribution

For any contrast, a point estimator of $\theta$ is at the mode of the CB-posterior distribution $p_t(\cdot)$. Moreover, if $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, then $p_t(\cdot)$ can be used to make inference in the Bayesian way or in the frequentist way. Otherwise, $p_t(\cdot)$ can be used to estimate the limit matrix $I_\theta$.

It has to be noted that building a contrast such that $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$ is particularly interesting since the calculation of $I_\theta$ and $\Gamma_\theta$ is avoided. However, we will see below that it is not always possible.

4 Applications in spatial statistics 

4.1 Least-square estimation of a variogram range 

This simulated case illustrates the application of the method for a real parameter. Here, the CB-posterior distribution cannot be directly used to make inference but can be used for estimating $I_\theta$.

We built a data set by simulating a centered Gaussian random field whose variogram is $\gamma_\theta(r) = 1 - \exp(-\theta r)$ with $\theta = 1$; $\theta$ is the inverse of the range parameter. The field was simulated over an $n \times n$ square grid ($n = 20$) with inter-node distance one. Figure 1 (left) shows the simulated random field. The sample variogram $\hat\gamma(h)$ was estimated for every possible inter-point distance $h$ less than the half diagonal of the grid; let $\mathcal{H}$ denote the set of these distances.

For the estimation of $\theta$, we chose a uniform prior density over [0, 4] (horizontal dotted line in Fig. 1, right) and we used the least-square contrast introduced in section 2.2 (see eq. (1)). The CB-posterior density is shown in Figure 1 (right, dotted curve). The MAP estimate is $\tilde\theta_t = 1.34$ (vertical line).

Estimation uncertainty was assessed by estimating the limit variance of $\tilde\theta_t$ which is $\Gamma_\theta/(nI_\theta)^2$. The term $\Gamma_\theta = \lim_{t \to \infty}V_\theta(\sqrt{t}\,\mathrm{grad}\,U_t(\theta))$ ($t = n^2$ here) was estimated based on Monte-Carlo simulations: 1000 Gaussian random fields were simulated under $\tilde\theta_t$, for each simulation the sample variogram $\{\hat\gamma(h) : h \in \mathcal{H}\}$ was computed, and the first derivative of the contrast in $\tilde\theta_t$, i.e. $-\sum_{h \in \mathcal{H}} h e^{-\tilde\theta_t h}\{\hat\gamma(h) - (1 - e^{-\tilde\theta_t h})\}$, was calculated; the sample variance of the derivatives multiplied by $t$ gave the estimate 1.97 for $\Gamma_\theta$.

Figure 1: Left: realization of a centered Gaussian random field with exponential variogram parameterized by $\theta = 1$, over a 20x20 square grid. Right: prior density (horizontal dotted line), contrast-based posterior density (dotted curve), density function of the limit distribution $\mathcal{N}(\tilde\theta_t, \Gamma_\theta/(nI_\theta)^2)$ (continuous and dashed lines when the estimate of the limit variance is based on simulations and when it is based on the posterior distribution), and MAP estimator (vertical line).

The term $I_\theta = \lim_{t \to \infty}\mathrm{H}U_t(\theta)$ was estimated in two ways: with the estimator $2\pi p_t(\tilde\theta_t)^2/t$ as suggested in section 3.4 and with Monte-Carlo simulations. In the former way, the estimate of $I_\theta$ is 0.20. The second way was carried out as follows: for each of the 1000 simulated Gaussian fields mentioned above, the second derivative of the contrast in $\tilde\theta_t$, i.e. $\sum_{h \in \mathcal{H}} h^2 e^{-\tilde\theta_t h}[e^{-\tilde\theta_t h} + \{\hat\gamma(h) - (1 - e^{-\tilde\theta_t h})\}]$, was computed; then, the sample mean of these derivatives gave the estimate 0.27 for $I_\theta$.

Thus, the estimate of the limit variance $\Gamma_\theta/(nI_\theta)^2$ of $\tilde\theta_t$ is 0.07 when $I_\theta$ is assessed by simulations and 0.12 when $I_\theta$ is computed from the CB-posterior distribution. The density function of the limit distribution $\mathcal{N}(\tilde\theta_t, \Gamma_\theta/(nI_\theta)^2)$ is drawn in Figure 1 (right). The continuous and dashed lines show this density when the estimate of the limit variance is 0.07 and 0.12, respectively. The true value $\theta = 1$ belongs to the 95%-confidence interval whatever the estimate of the limit variance is. We see how the two versions of the limit density are different from the CB-posterior density.

To assess the efficiency of the method, the coverage rate of the 95%-confidence interval was measured by applying the estimation procedure to 1000 simulated fields. The coverage rate is 94.6% when the estimate of $I_\theta$ is based on Monte-Carlo simulations and 94.7% when the estimate of $I_\theta$ comes from the contrast-based posterior density.
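The arithmetic behind the two limit-variance estimates reported in this section can be reproduced directly from the quoted values (1.97 for the gradient variance, 0.27 or 0.20 for the Hessian, a 20 x 20 grid and a MAP estimate of 1.34). The helper names below are hypothetical, and the interval uses the usual normal quantile 1.96, which is an assumption about how the 95%-confidence interval is built.

```python
import math


def limit_variance(gamma, i_theta, n):
    """Limit variance Gamma / (n * I)^2 of the estimator on an n x n grid."""
    return gamma / (n * i_theta) ** 2


def normal_interval(center, variance, z=1.96):
    """Symmetric normal confidence interval around the point estimate."""
    half = z * math.sqrt(variance)
    return center - half, center + half


v_sim = limit_variance(1.97, 0.27, 20)   # Hessian estimated by simulation
v_post = limit_variance(1.97, 0.20, 20)  # Hessian estimated from the CB-posterior
ci_sim = normal_interval(1.34, v_sim)
ci_post = normal_interval(1.34, v_post)
```

Both variances round to the values given in the text (0.07 and 0.12), and both intervals contain the true value 1.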

4.2 Pseudo-likelihood estimation of a Markovian spatial 
model 

This simulated case illustrates the application of the method for a bivariate parameter. Here, the CB-posterior distribution is close to the limit distribution of the estimator. Here also, this posterior distribution cannot be directly used to make inference but can be used for estimating $I_\theta$.

We built a data set by simulating the spatial Markov field with two states, 0 and 1, specified in section 2.2. The field was simulated on an $n \times n$ square grid $\mathcal{I}$ ($n = 20$). Figure 2 (left) shows a simulation of this field for $\theta_1 = 0$ and $\theta_2 = 0.3$. To estimate $\theta_1$ and $\theta_2$, we applied the estimation method proposed in this article by using the pseudo-likelihood contrast introduced in section 2.2 (see eq. (2)) and a uniform prior density over $[-1.5, 1.5]^2$. The CB-posterior density is shown in Figure 2 (center). The MAP estimate is $\tilde\theta_t = (-0.21, 0.38)$.

For providing the limit distribution $\mathcal{N}(\tilde\theta_t, I_\theta^{-1}\Gamma_\theta I_\theta^{-1}/n^2)$ of the estimator, matrices $\Gamma_\theta$ and $I_\theta$ must be estimated. We computed the gradient and the Hessian of the contrast for 1000 Markov fields simulated under $\tilde\theta_t$, and we used the sample variance of the gradients for estimating $\Gamma_\theta$ and the sample mean of the Hessians for estimating $I_\theta$. The estimate of the limit variance matrix $I_\theta^{-1}\Gamma_\theta I_\theta^{-1}/n^2$ was finally

$$\begin{pmatrix} 0.14 & -0.055 \\ -0.055 & 0.022 \end{pmatrix}.$$

Almost the same matrix was obtained when we estimated $I_\theta$ by fitting a normal density to the CB-posterior density as suggested in section 3.4. Figure 2 (right) shows the limit density function of the estimator together with the 95%-confidence zone. We can see that the true parameter belongs to this zone. Moreover, Figure 2 shows that the limit density is quite close to the posterior density. The pseudo-likelihood, which accounts for short-distance interactions, certainly brings almost the same information as the likelihood. It has however to be noted that this would not be the case if long-distance interactions had been introduced in the spatial Markov model.

Figure 2: Left: realization of a Markovian spatial process with two states over a 20x20 grid. Center: contrast-based posterior density. Right: limit density $\mathcal{N}(\tilde\theta_t, I_\theta^{-1}\Gamma_\theta I_\theta^{-1}/n^2)$. On the center and right panels, the MAP estimate and the true parameter are drawn with a black dot and a circle, respectively. On the right panel, the continuous line circumscribes the 95%-confidence zone.



4.3 Estimation of an autosimilar model using moments 

This real case study illustrates the application of the method for a bivariate parameter. Here, the CB-posterior distribution can be directly used to make inference.

In this section we aim to build and fit a model for soil roughness. Soil roughness plays an important role in the distribution of rain water into infiltration, ponding and runoff. It also modifies reflectance properties of soils used to estimate soil moisture with remote sensing, for example. An experiment was carried out to measure soil roughness at a small scale. Soil heights were measured every 2 mm along 1.18 m transects in a cultivated field (Bertuzzi et al., 1995). Figure 3 (top) shows the distributions of heights for two among twelve sampled transects. These distributions were obtained after subtraction of the trend estimated with a kernel smoothing. The mean height computed from the 12 transects is 7.6 mm, the maximum is 22.9 mm. Several models have been proposed to describe soil surface. For instance, in Boolean models and autosimilar models (Bertuzzi et al., 1995; Goulard and Chadœuf, 1994; Lantuéjoul, 2002, chap. 14), basic random elements (e.g. cylinders) are drawn from a given law and the soil surface is the maximum height in the former model and the summed height in the latter model.

Here, we aim to estimate the parameters of an autosimilar model based on random cylinders, each cylinder having the same height and radius. For any $x \in \mathbb{R}^2$ and $r > 0$, let $f(x, r) = r\,\mathbf{1}\{\|x\| \le r\}$ be the function describing the cylinder which is centered in $x$ and whose radius and height are equal to $r$. In addition, let $(X, R)$ be a marked Poisson point process defined over $\mathbb{R}^2 \times \mathbb{R}_+^*$ with intensity function $h(x, r) = \alpha\exp\{-\beta r\}$. The random surface $Y$ representing the soil surface is defined by

$$Y(M) = \sum_{(x, r) \in (X, R)} f(x - M, r).$$

For such a process, it is difficult to calculate the joint distribution of the heights whereas the moments can easily be written. The parameter vector $\theta = (\alpha, \beta)$ has two components and we propose to estimate it using the first two moments: $\hat\mu_A = \left(\frac{1}{\nu(A)}\int_A Y(M)\,dM,\ \frac{1}{\nu(A)}\int_A Y(M)^2\,dM\right)$, where $A$ is the set of the sampled transects and $\nu(A)$ is its measure.
If border effects are neglected, the expected value of $\hat\mu_A$ is

$$E(\hat\mu_A) = \left(6\pi\frac{\alpha}{\beta^4},\ 36\pi^2\frac{\alpha^2}{\beta^8} + 24\pi\frac{\alpha}{\beta^5}\right).$$

Moreover, the variance matrix of $\hat\mu_A$ satisfies

where the components of V are 



V22 = 7!-- + {(6!)1287r + (10!)32«:}— + (3!)(5!)128 



TT 



14' 



with K, = (arccos(M) — u\/l — M^)(arccos(f ) — v\/l — f ^) (^^^i dudv. 

The estimation method is applied by using a uniform prior over $[1, 100] \times [1, 5]$ and a contrast based on the weighted least squares of the first two moments:

$$U_A(\theta) = (\hat\mu_A - E(\hat\mu_A))'\,V^{-1}(\hat\mu_A - E(\hat\mu_A))/2.$$

For this contrast, the matrices $I_\theta$ and $\Gamma_\theta$ are equal and their component $(i, j)$ is

$$\frac{\partial E(\hat\mu_A)'}{\partial\theta_i}V^{-1}\frac{\partial E(\hat\mu_A)}{\partial\theta_j}.$$

Consequently, $I_\theta^{-1}\Gamma_\theta I_\theta^{-1} = I_\theta^{-1}$, and the CB-posterior density can be used as an approximation of the limit density of the MAP estimator $\tilde\theta_A$ or as a posterior distribution of the parameter $\theta$ (see section 3). Figure 3 (bottom) shows the joint CB-posterior distribution and the marginals. The MAP estimate of $\theta$ is $\tilde\theta_A = (46.6, 3.28)$. Marginal 95%-confidence intervals of $\alpha$ and $\beta$ are [36.1, 58.5] and [3.07, 3.48], respectively.
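The moment-based contrast can be sketched as follows. The two model moments are obtained from Campbell's formulas for a Poisson sum of cylinders with intensity $\alpha e^{-\beta r}$, neglecting border effects: the mean is $\int_0^\infty r\,\pi r^2\,\alpha e^{-\beta r}dr = 6\pi\alpha/\beta^4$ and the variance is $\int_0^\infty r^2\,\pi r^2\,\alpha e^{-\beta r}dr = 24\pi\alpha/\beta^5$. In the code, the weight matrix `v_inv` is treated as a fixed, user-supplied 2 x 2 matrix, which is a simplifying assumption, and the function names are hypothetical.

```python
import math


def model_moments(alpha, beta):
    """First two moments of the summed-cylinder surface (border effects neglected)."""
    m1 = 6.0 * math.pi * alpha / beta ** 4
    m2 = m1 ** 2 + 24.0 * math.pi * alpha / beta ** 5
    return m1, m2


def moment_contrast(theta, mu_hat, v_inv):
    """Weighted least-squares contrast between empirical and model moments.

    v_inv is a fixed 2 x 2 weight matrix (assumption: supplied by the user).
    """
    m = model_moments(*theta)
    d0, d1 = mu_hat[0] - m[0], mu_hat[1] - m[1]
    return 0.5 * (d0 * (v_inv[0][0] * d0 + v_inv[0][1] * d1)
                  + d1 * (v_inv[1][0] * d0 + v_inv[1][1] * d1))


m1, m2 = model_moments(46.6, 3.28)
u_zero = moment_contrast((46.6, 3.28), (m1, m2), [[1.0, 0.0], [0.0, 1.0]])
```

Evaluated at the reported MAP estimate (46.6, 3.28), the model mean is close to the empirical mean height of 7.6 mm quoted in this section, which is a useful consistency check on the moment formula.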



Figure 3: Top: distribution of heights for two transects (heights were corrected
by kernel smoothing to subtract the trend). Bottom left: contrast-based
posterior density for $(\alpha, \beta)$; the MAP estimate is at the black dot. Bottom center
and right: contrast-based posterior marginal densities for $\alpha$ and $\beta$ (continuous
lines) and prior marginal densities (dashed lines).



5 Discussion



We have studied a method of estimation exploiting a contrast-based posterior
distribution (CBPD). This method includes the classical likelihood-based
procedures (MLE and Bayesian estimation), but has been mainly developed to
circumvent difficulties encountered with the likelihood by generalizing the Bayesian
formula of the posterior distribution, thereby extending the proposal of Lin (2006).
The CBPD can be used to make frequentist inference and, in specific situations,
Bayesian inference. In the case of frequentist inference, the use of the CBPD
reduces the analytical calculations usually required to compute the limit
variance matrix of the estimator. In this article, the method has been applied
to spatial data sets, but it can be applied to other cases where likelihood-based
procedures are not appropriate.

In the frequentist viewpoint, the CBPD can be used to provide a point
estimator (the posterior mode) and the limit distribution of this estimator. The
limit distribution is directly approximated by the CBPD if the variance of the
gradient vector of the contrast is equal to the inverse of the limit Hessian
matrix of the contrast (i.e. $\Gamma_\theta = I_\theta^{-1}$; see the third application). In this case, it is
not required to calculate and estimate the variance matrix of the estimator. In
other cases, the limit distribution is not directly available, but the Hessian
matrix of the contrast can be easily estimated from the CBPD and, consequently,
the calculation of the second derivatives of the contrast is avoided (see the first
two applications). Note that using Bayesian calculation to make frequentist
estimation has been proposed in the literature (Robert and Hwang, 1996;
Robert and Titterington, 1998; Jacquier et al., 2007), but the proposals
were restricted to maximum likelihood estimation.
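
The following sketch illustrates how a variance matrix of the sandwich form $I_\theta^{-1} \Gamma_\theta I_\theta^{-1}$ might be assembled without second derivatives. It rests on an assumption that is ours, not a result quoted from the paper: the CBPD is taken to be close to a Gaussian with covariance $I_\theta^{-1}/t$, so that $I_\theta^{-1}$ is estimated by $t$ times the covariance of draws from the CBPD. The names `sandwich_variance`, `post_cov`, `gamma` and `t` are hypothetical, and the numbers are illustrative.

```python
def matmul(a, b):
    """Product of two small square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sandwich_variance(post_cov, gamma, t):
    """Approximate variance of the estimator, I^{-1} Gamma I^{-1} / t.

    Assumption of this sketch: I^{-1} is estimated by t * post_cov, where
    post_cov is the covariance of draws from the CB-posterior.
    """
    i_inv = [[t * x for x in row] for row in post_cov]
    limit = matmul(matmul(i_inv, gamma), i_inv)
    return [[x / t for x in row] for row in limit]


# Hypothetical numbers for illustration only.
t = 100.0
post_cov = [[0.02, 0.0], [0.0, 0.005]]   # covariance of CB-posterior draws
gamma = [[0.5, 0.0], [0.0, 2.0]]         # chosen here so that Gamma equals I
variance = sandwich_variance(post_cov, gamma, t)
```

In this example `gamma` coincides with the matrix $I$ implied by `post_cov`, and the function returns `post_cov` itself, which is the situation where the CBPD can be read directly as the limit distribution.
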

In the Bayesian viewpoint, the CBPD can be used as a classical posterior
distribution when $\Gamma_\theta = I_\theta^{-1}$, as in the third application. Note, however,
that the CBPD does not always coincide with the classical posterior distribution.
The CBPD has to be viewed as a posterior distribution based on the information
brought by the contrast which is used.

Even if the proposed procedure has advantages, it also faces two classical
limitations: the choice of the prior distribution (or the penalization function in the
frequentist viewpoint), which can influence the posterior inference, and the choice
of the contrast. Regarding the former limitation, we refer to Clarke and Gustafson
(1998) and Rootzén and Olsson (2006), for example. Regarding the choice of the
contrast, we have two comments. The first comment concerns the possibility of
building a contrast such that $\Gamma_\theta = I_\theta^{-1}$ (the case where our method is the most
interesting). It was possible in the real case-study because we could provide the
analytical form for the variance matrix of the sample moments. However, it was
not possible in the two simulated case-studies. Indeed, for the estimation of the
range parameter, we should have modeled the variance of the variogram. However,
such a practice is not common in geostatistics when the field is not assumed
to be Gaussian. For the estimation of the spatial Markov model, the spatial
dependences make it impossible to get a transformed pseudo-likelihood such that
$\Gamma_\theta = I_\theta^{-1}$; note that the problem of dependence can be circumvented
with coding techniques (Besag, 1975) but, with such techniques, a part of
the information is lost. This leads us to our second comment, about the information
brought by contrasts. We see that in the real case-study the two estimators
are strongly correlated. We could have tried to use another contrast to avoid
correlation. For example, together with the sample mean, we could have used
the covariance at a given distance instead of the variance to get two moments
which are less correlated. However, the calculation of the expected value and the
variance-covariance of these moments is much trickier. Thus, to be able to
derive analytical expressions and apply the method as it is presented, the choice
of the contrast is limited. Nevertheless, simulations could be used to circumvent
this difficulty, as in approximate Bayesian computation (Beaumont et al., 2002).
This could be an interesting extension of the estimation method proposed in this
paper.
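
As one possible reading of this suggestion, the expected value and the variance-covariance matrix of the chosen moments could be estimated by Monte Carlo simulation at each candidate parameter value and then plugged into a weighted least-squares contrast. The sketch below is not part of the method presented in the paper: `simulate` is a hypothetical user-supplied simulator, and the toy exponential model is unrelated to the soil-surface model.

```python
import random


def simulated_moment_summary(simulate, theta, n_sim, seed=0):
    """Monte Carlo mean vector and covariance matrix of simulated moments.

    `simulate(theta, rng)` is a hypothetical simulator returning a tuple of
    moments computed on one simulated data set.
    """
    rng = random.Random(seed)
    draws = [simulate(theta, rng) for _ in range(n_sim)]
    k = len(draws[0])
    mean = [sum(d[i] for d in draws) / n_sim for i in range(k)]
    cov = [[sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in draws) / (n_sim - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov


# Toy simulator (an assumption of this sketch): mean and mean square of
# 50 exponential variables with rate theta.
def toy_simulate(theta, rng):
    ys = [rng.expovariate(theta) for _ in range(50)]
    return (sum(ys) / len(ys), sum(y * y for y in ys) / len(ys))


mean, cov = simulated_moment_summary(toy_simulate, 2.0, n_sim=2000)
```

The simulation noise in `mean` and `cov` would propagate into the contrast, so the number of simulations per parameter value would have to be chosen with that in mind.
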